53 research outputs found

    Genetic Variation in Human Gene Regulatory Factors Uncovers Regulatory Roles in Local Adaptation and Disease

    Differences in gene regulation have been suggested to play essential roles in the evolution of phenotypic changes. Although DNA changes in a cis-regulatory element affect only the regulation of its corresponding gene, variation in gene regulatory factors (trans) can have a broader effect, because the expression of many target genes might be affected. Aiming to better understand how natural selection may have shaped the diversity of gene regulatory factors in humans, we assembled a catalog of all proteins involved in controlling gene expression. We found that at least five DNA-binding transcription factor classes are enriched among genes located in candidate regions for selection, suggesting that they might be relevant for understanding regulatory mechanisms involved in human local adaptation. The class of KRAB-ZNFs, zinc-finger (ZNF) genes with a Krüppel-associated box, stands out in three ways: first, it has the most genes located in candidate regions for positive selection; second, it displays the most nonsynonymous single nucleotide polymorphisms (SNPs) with high genetic differentiation between populations within these regions; third, it has 27 KRAB-ZNF gene clusters with high extended haplotype homozygosity. Our further characterization of nonsynonymous SNPs in ZNF genes located within candidate regions for selection suggests regulatory modifications that might influence the expression of target genes at the population level. Our detailed investigation of three candidate regions revealed possible explanations for how SNPs may influence the prevalence of schizophrenia, eye development, and fertility in humans, among other phenotypes. The genetic variation we characterized here may be responsible for subtle to rough regulatory changes that could be important for understanding human adaptation.

    Species-Specific Changes in a Primate Transcription Factor Network Provide Insights into the Molecular Evolution of the Primate Prefrontal Cortex

    The human prefrontal cortex (PFC) differs from that of other primates with respect to size, histology, and functional abilities. Here, we analyzed genome-wide expression data of humans, chimpanzees, and rhesus macaques to discover evolutionary changes in transcription factor (TF) networks that may underlie these phenotypic differences. We determined the co-expression networks of all TFs with species-specific expression, including their potential target genes and interaction partners, in the PFC of all three species. Integrating these networks allowed us to infer an ancestral network for all three species. This ancestral network, as well as the network for each species, is enriched for genes involved in forebrain development, axonogenesis, and synaptic transmission. Our analysis allows us to directly compare the networks of each species to determine which links have been gained or lost during evolution. Interestingly, we detected that most links were gained on the human lineage, indicating increased TF cooperativity in humans. By comparing network changes between different tissues, we discovered that in brain tissues, but not in the other tissues, the human networks always had the highest connectivity. To pinpoint molecular changes underlying species-specific phenotypes, we analyzed the sub-networks of TFs derived only from genes with species-specific expression changes in the PFC. These sub-networks differed significantly in structure and function between human and chimpanzee. For example, the human-specific sub-network is enriched for TFs implicated in cognitive disorders and for genes involved in synaptic plasticity and cognitive functions. Our results suggest evolutionary changes in TF networks that might have shaped morphological and functional differences between primate brains, in particular in the human PFC.

    Temporal ordering of substitutions in RNA evolution: uncovering the structural evolution of the human accelerated region 1

    The Human Accelerated Region 1 (HAR1) is the most rapidly evolving region in the human genome. It is part of two overlapping long non-coding RNAs, has a length of only 118 nucleotides, and features 18 human-specific changes compared to an ancestral sequence that is extremely well conserved across non-human primates. The human HAR1 forms a stable secondary structure that is strikingly different from the one in chimpanzee as well as other closely related species, again emphasizing its human-specific evolutionary history. This suggests that positive selection has acted to stabilize human-specific features in the ensemble of HAR1 secondary structures. To investigate the evolutionary history of the human HAR1 structure, we developed a computational model that evaluates the relative likelihood of evolutionary trajectories as a probabilistic version of a Hamiltonian path problem. The model predicts that the most likely last step in turning the ancestral primate HAR1 into the human HAR1 was exactly the substitution that distinguishes the modern human HAR1 sequence from that of Denisovan, an archaic human, providing independent support for our model. The MutationOrder software is available for download and can be applied to other instances of RNA structure evolution.
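The idea of ranking substitution orders as a Hamiltonian-path-style problem can be illustrated with a small sketch. This is not the MutationOrder implementation: the function `most_likely_order` and the scoring callback `step_logp` are hypothetical names, and the toy scoring function stands in for whatever structure-based weights the actual model uses. The sketch assumes that the log-probability of each step depends only on which substitutions have already occurred, and it uses dynamic programming over subsets to find the highest-scoring order.

```python
def most_likely_order(n, step_logp):
    """Return (order, log_prob) of the best ordering of n substitutions.

    step_logp(mask, i) is a hypothetical callback giving the log-probability
    of applying substitution i when the substitutions in bitmask `mask`
    have already been applied.
    """
    full = (1 << n) - 1
    # best[mask] = (best log-probability to reach mask, last substitution applied)
    best = {0: (0.0, None)}
    # Every proper subset of a mask is numerically smaller than the mask,
    # so iterating in numeric order finalizes each state before it is used.
    for mask in range(full + 1):
        score, _ = best[mask]
        for i in range(n):
            if mask & (1 << i):
                continue  # substitution i already applied
            nxt = mask | (1 << i)
            cand = score + step_logp(mask, i)
            if nxt not in best or cand > best[nxt][0]:
                best[nxt] = (cand, i)
    # Backtrack from the fully substituted state to recover the order.
    order = []
    mask = full
    while mask:
        i = best[mask][1]
        order.append(i)
        mask &= ~(1 << i)
    order.reverse()
    return order, best[full][0]


def toy_logp(mask, i):
    # Toy assumption: substitution i is favored exactly when i substitutions
    # have already happened; every other step is penalized.
    return 0.0 if i == bin(mask).count("1") else -1.0


order, logp = most_likely_order(3, toy_logp)
```

With 18 substitutions, as in HAR1, the subset table has 2^18 states, which is small enough for this kind of exhaustive dynamic program.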

    Whole transcriptomic network analysis using Co-expression Differential Network Analysis (CoDiNA)

    Biological and medical sciences are increasingly acknowledging the significance of gene co-expression networks for investigating complex systems, phenotypes, or diseases. Typically, complex phenotypes are investigated under varying conditions. While approaches for comparing nodes and links in two networks exist, almost no methods for the comparison of multiple networks are available and, to the best of our knowledge, no comparative method allows for whole transcriptomic network analysis. However, it is the aim of many studies to compare networks of different conditions, for example, tissues, diseases, treatments, time points, or species. Here we present a method for the systematic comparison of an unlimited number of networks, with an unlimited number of transcripts: Co-expression Differential Network Analysis (CoDiNA). In particular, CoDiNA detects links and nodes that are common, specific, or different among the networks. We developed a statistical framework to normalize between these different categories of common or changed network links and nodes, resulting in a comprehensive network analysis method, more sophisticated than simply comparing the presence or absence of network nodes. Applying CoDiNA to a neurogenesis study, we identified candidate genes involved in neuronal differentiation. We experimentally validated one candidate, demonstrating that its overexpression resulted in a significant disturbance in the underlying gene regulatory network of neurogenesis. Using clinical studies, we compared whole transcriptome co-expression networks from individuals with or without HIV and active tuberculosis (TB) and detected signature genes specific to HIV. Furthermore, analyzing multiple cancer transcription factor (TF) networks, we identified common and distinct features for particular cancer types. These CoDiNA applications demonstrate the successful detection of genes associated with specific phenotypes. Moreover, CoDiNA can also be used for comparing other types of undirected networks, for example, metabolic, protein-protein interaction, ecological, and psychometric networks. CoDiNA is publicly available as an R package in CRAN (https://CRAN.R-project.org/package=CoDiNA).
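The three link categories named in the abstract (common, specific, different) can be illustrated with a minimal sketch. This is not the CoDiNA R package or its statistical framework: the function `classify_links`, the `threshold` parameter, and the exact category definitions are assumptions made for illustration. Here a link counts as present in a network when its absolute weight exceeds the threshold; it is "common" when present with the same sign in every network, "different" when present in every network with conflicting signs, and "specific" when present in only some of them.

```python
from itertools import chain


def classify_links(networks, threshold=0.5):
    """Classify each link across several undirected, signed networks.

    networks: list of dicts mapping frozenset({node_a, node_b}) to a signed
    weight (for example, a correlation). Names and threshold are illustrative.
    """
    def sign(weight):
        if weight > threshold:
            return 1
        if weight < -threshold:
            return -1
        return 0  # link treated as absent

    categories = {}
    for edge in set(chain.from_iterable(networks)):
        signs = [sign(net.get(edge, 0.0)) for net in networks]
        if all(s == 0 for s in signs):
            continue  # absent everywhere after thresholding
        if 0 in signs:
            categories[edge] = "specific"   # missing from at least one network
        elif len(set(signs)) == 1:
            categories[edge] = "common"     # same sign in every network
        else:
            categories[edge] = "different"  # present everywhere, signs disagree
    return categories


ab, ac, bc = frozenset("AB"), frozenset("AC"), frozenset("BC")
healthy = {ab: 0.9, ac: 0.8, bc: 0.7}
disease = {ab: 0.8, ac: -0.9}
links = classify_links([healthy, disease])
```

The function accepts any number of networks, which mirrors the multi-network comparison described above; the normalization step described in the abstract is not reproduced here.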

    Multiple haplotype-resolved genomes reveal population patterns of gene and protein diplotypes

    To fully understand human biology and link genotype to phenotype, the phase of DNA variants must be known. Here we present a comprehensive analysis of haplotype-resolved genomes to assess the nature and variation of haplotypes and their pairs, diplotypes, in European population samples. We use a set of 14 haplotype-resolved genomes generated by fosmid clone-based sequencing, complemented and expanded by up to 372 statistically resolved genomes from the 1000 Genomes Project. We find immense diversity of both haploid and diploid gene forms, up to 4.1 and 3.9 million, corresponding to 249 and 235 per gene on average. Less than 15% of autosomal genes have a predominant form. We describe a ‘common diplotypic proteome’, a set of 4,269 genes encoding two different proteins in over 30% of genomes. Moreover, we show an abundance of cis configurations of mutations in the 386 genomes, with an average cis/trans ratio of 60:40, and distinguishable classes of cis- versus trans-abundant genes. This work identifies key features characterizing the diplotypic nature of human genomes and provides a conceptual and analytical framework, rich resources, and novel hypotheses on the functional importance of diploidy.

    Combined Experimental and System-Level Analyses Reveal the Complex Regulatory Network of miR-124 during Human Neurogenesis

    Non-coding RNAs regulate many biological processes, including neurogenesis. The brain-enriched miR-124 has been described as a key player in neuronal differentiation via its complex but little understood regulation of thousands of annotated targets. To systematically chart its regulatory functions, we used CRISPR/Cas9 gene editing to disrupt all six miR-124 alleles in human induced pluripotent stem cells. Upon neuronal induction, miR-124-deleted cells underwent neurogenesis and became functional neurons, albeit with altered morphology and neurotransmitter specification. Using RNA-induced-silencing-complex precipitation, we identified 98 high-confidence miR-124 targets, of which some directly led to decreased viability. By performing advanced transcription-factor-network analysis, we identified indirect miR-124 effects on apoptosis, neuronal subtype differentiation, and the regulation of previously uncharacterized zinc finger transcription factors. Our data emphasize the need for combined experimental and system-level analyses to comprehensively disentangle and reveal miRNA functions, including their involvement in the neurogenesis of diverse neuronal cell types found in the human brain.

    Gain, Loss and Divergence in Primate Zinc-Finger Genes: A Rich Resource for Evolution of Gene Regulatory Differences between Species

    The molecular changes underlying major phenotypic differences between humans and other primates are not well understood, but alterations in gene regulation are likely to play a major role. Here we performed a thorough evolutionary analysis of the largest family of primate transcription factors, the Krüppel-type zinc finger (KZNF) gene family. We identified and curated gene and pseudogene models for KZNFs in three primate species, chimpanzee, orangutan, and rhesus macaque, to allow for a comparison with the curated set of human KZNFs. We show that the recent evolutionary history of primate KZNFs has been complex, including many lineage-specific duplications and deletions. We found 213 species-specific KZNFs, among them 7 human-specific and 23 chimpanzee-specific genes. Two human-specific genes were validated experimentally. Ten genes have been lost in humans and 13 in chimpanzees, either through deletion or pseudogenization. We also identified 30 KZNF orthologs with human-specific and 42 with chimpanzee-specific sequence changes that are predicted to affect DNA binding properties of the proteins. Eleven of these genes show signatures of accelerated evolution, suggesting positive selection between humans and chimpanzees. During primate evolution, the most extensive re-shaping of the KZNF repertoire, including most gene additions, pseudogenizations, and structural changes, occurred within the subfamily Homininae. Using zinc finger (ZNF) binding predictions, we suggest the potential impact these changes have had on human gene regulatory networks. The large species differences in this family of TFs stand in stark contrast to the overall high conservation of primate genomes and potentially represent a potent driver of primate evolution.

    Divergent evolution in the genomes of closely related lacertids, Lacerta viridis and L. bilineata, and implications for speciation

    Lacerta viridis and Lacerta bilineata are sister species of European green lizards (eastern and western clades, respectively) that, until recently, were grouped together as the L. viridis complex. Genetic incompatibilities were observed between lacertid populations through crossing experiments, which led to the delineation of two separate species within the L. viridis complex. The population history of these sister species and the processes driving their divergence are unknown. We constructed the first high-quality de novo genome assemblies for both L. viridis and L. bilineata through Illumina and PacBio sequencing, with annotation support provided by transcriptome sequencing of several tissues. To estimate gene flow between the two species and identify factors involved in reproductive isolation, we studied their evolutionary history, identified genomic rearrangements, and detected signatures of selection on non-coding RNA and on protein-coding genes. Here we show that gene flow was primarily unidirectional from L. bilineata to L. viridis after their split at least 1.15 million years ago. We detected positive selection of the non-coding repertoire; mutations in transcription factors; accumulation of divergence through inversions; and selection on genes involved in neural development, reproduction, and behavior, as well as in ultraviolet response, possibly driven by sexual selection, whose contribution to reproductive isolation between these lacertid species needs to be further evaluated. The combination of short and long sequence reads resulted in one of the most complete lizard genome assemblies. The characterization of a diverse array of genomic features provided valuable insights into the demographic history of divergence among European green lizards, as well as key species differences, some of which are candidates that could have played a role in speciation. In addition, our study generated valuable genomic resources that can be used to address conservation-related issues in lacertids.